 wolfram alpha


Testing GPT-4-o1-preview on math and science problems: A follow-up study

Davis, Ernest

arXiv.org Artificial Intelligence

In August 2023, Scott Aaronson and I reported the results of testing GPT-4 with the Wolfram Alpha and Code Interpreter plug-ins over a collection of 105 original high-school level and college-level science and math problems (Davis and Aaronson, 2023). In September 2024, I tested the recently released model GPT-4o1-preview on the same collection. Overall I found that performance had significantly improved, but was still considerably short of perfect. In particular, problems that involve spatial reasoning are often stumbling blocks. On September 12, OpenAI (2024) released two preliminary versions, "ChatGPT-o1-preview" and "ChatGPT-o1-mini", of a forthcoming product "ChatGPT-o1".


Testing GPT-4 with Wolfram Alpha and Code Interpreter plug-ins on math and science problems

Davis, Ernest, Aaronson, Scott

arXiv.org Artificial Intelligence

Our test sets were too small and too haphazard to support statistically valid conclusions, but they were suggestive of a number of conclusions. We summarize these here, and discuss them at greater length in section 7. Over the kinds of problems tested, GPT-4 with either plug-in is significantly stronger than GPT-4 by itself, or, almost certainly, than any AI that existed a year ago. However, it is still far from reliable; it often outputs a wrong answer or fails to output any answer. In terms of overall score, we would judge that these systems perform on the level of a middling undergraduate student. However, their capacities and weaknesses do not align with those of a human student; the systems solve some problems that even capable students would find challenging, whereas they fail on some problems that even middling high school students would find easy.


Learnings from Data Integration for Augmented Language Models

Halevy, Alon, Dwivedi-Yu, Jane

arXiv.org Artificial Intelligence

One of the limitations of large language models is that they do not have access to up-to-date, proprietary or personal data. As a result, there are multiple efforts to extend language models with techniques for accessing external data. In that sense, LLMs share the vision of data integration systems whose goal is to provide seamless access to a large collection of heterogeneous data sources. While the details and the techniques of LLMs differ greatly from those of data integration, this paper shows that some of the lessons learned from research on data integration can elucidate the research path we are conducting today on language models.


OpenAI rolls out ChatGPT plugins for third parties • The Register

#artificialintelligence

Analysis OpenAI this week introduced ChatGPT plugins, a way to extend the scope of its chatbot language model beyond the slurry of internet training data to bespoke business information. So wary is OpenAI of all the ways that ChatGPT and its other models can misfire that the company begins its announcement by reassuring readers that its cautious rollout follows from its desire to address "safety and alignment challenges." It does so with good reason – large language models (LLMs), referred to euphemistically as artificial intelligence or just AI, are seen by some to be venomous constructs that must be contained. LLMs are also limited to whatever information can be accessed or derived from their training data. As OpenAI puts it, "This information can be out-of-date and is one-size fits all across applications. Furthermore, the only thing language models can do out-of-the-box is emit text. This text can contain useful instructions, but to actually follow these instructions you need another process."


Wolfram

#artificialintelligence

It happened to us with Wolfram Alpha back in 2009. It happened with our Physics Project in 2020. I've been tracking neural net technology for a long time (about 43 years, actually). And even having watched developments in the past few years I find the performance of ChatGPT thoroughly remarkable. Finally, and suddenly, here's a system that can successfully generate text about almost anything--that's very comparable to what humans might write.


Build Your Next Project with Wolfram Alpha API and Python

#artificialintelligence

Anybody who at some point struggled with math knows Wolfram Alpha and was probably saved by its ability to solve any equation, plot any function or visualize logic circuits. Wolfram Alpha can, however, do much more than that, including chemistry, linguistics or history, and most importantly, it can give you all the answers through its public API. So, in this article we will explore how you can use it to answer simple questions, solve mathematical problems, render plots or even describe DNA sequences! The Wolfram Alpha API is free (for non-commercial usage), but we still need to get an API key (AppID) to perform queries against the API endpoints. The article's first example uses the Wolfram Alpha Full Results API to find out the lifespan of a mosquito.
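
The code that this excerpt refers to is not included here. A minimal sketch of such a query, assuming the Full Results API's `v2/query` endpoint with `appid`, `input` and `output=json` parameters, and a JSON reply shaped as `queryresult` → `pods` → `subpods` → `plaintext`, might look like the following. The function names are hypothetical and are not taken from the article; check the endpoint and field names against the current Wolfram Alpha documentation before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed Full Results API endpoint; confirm against the official docs.
ENDPOINT = "https://api.wolframalpha.com/v2/query"


def build_query_url(question: str, app_id: str) -> str:
    """Return the request URL for a plain-language question."""
    params = {"appid": app_id, "input": question, "output": "json"}
    return f"{ENDPOINT}?{urlencode(params)}"


def extract_plaintext(response: dict) -> list[tuple[str, str]]:
    """Collect (pod title, plaintext) pairs from a parsed JSON reply.

    Assumes the reply nests answers as queryresult -> pods -> subpods.
    Returns an empty list when the query was not successful.
    """
    result = response.get("queryresult", {})
    if not result.get("success"):
        return []
    answers = []
    for pod in result.get("pods", []):
        for subpod in pod.get("subpods", []):
            text = subpod.get("plaintext")
            if text:  # skip subpods that only carry an image
                answers.append((pod.get("title", ""), text))
    return answers


def ask(question: str, app_id: str) -> list[tuple[str, str]]:
    """Send the question over the network and return the extracted answers."""
    with urlopen(build_query_url(question, app_id), timeout=10) as reply:
        return extract_plaintext(json.load(reply))
```

Calling `ask("what is lifespan of mosquito", app_id)` with your own AppID would perform the lookup the excerpt describes. Keeping URL construction and response parsing separate from the network call makes both easy to check without an API key.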


Stephen Wolfram on the future of programming and why we live in a computational universe

#artificialintelligence

This article originally appeared on TechRepublic. When it came to figuring out which computer scientist should help linguists decipher inscrutable alien texts, it was Stephen Wolfram who got the call. Sure, these extraterrestrials may only have existed in the sci-fi movie Arrival, but if ET ever does drop out of orbit, Wolfram might well still be on the short list of people to contact. The British-born computer scientist's life is littered with exceptional achievements -- completing a PhD in theoretical physics at Caltech at age 20, winning a MacArthur Genius Grant at 21, and creating the technical computing platform Mathematica (which is used by millions of mathematicians, scientists, and engineers worldwide), plus the Wolfram Language, and the Wolfram Alpha knowledge engine.


The Ease of Wolfram Alpha, the Power of Mathematica: Introducing Wolfram

#artificialintelligence

Wolfram Alpha has been a huge hit with students. Whether in college or high school, Wolfram Alpha has become a ubiquitous way for students to get answers. But it's a one-shot process: a student enters the question they want to ask (say in math) and Wolfram Alpha gives them the (usually richly contextualized) answer. It's incredibly useful--especially when coupled with its step-by-step solution capabilities. But what if one doesn't want just a one-shot answer? What if one wants to build up (or work through) a whole computation?


Wolfram gives developers free access to the engine that powers its technology stack

#artificialintelligence

Wolfram Research today announced free access to the engine that powers its technology stack. The Wolfram Engine is available to developers for free, assuming it is used for non-production development. Wolfram Research is best known for creating the modern technical computing system Mathematica and the computational knowledge engine Wolfram Alpha (stylized Wolfram|Alpha). Founded by computer scientist Stephen Wolfram, the company celebrated the 10-year anniversary of Wolfram Alpha just last week. "The Wolfram Engine is the heart of all our products," Stephen Wolfram explains.


AI Weekly: Despite fears of job-stealing robots, AI did a lot of good this year

#artificialintelligence

In case you somehow missed the dire predictions regarding artificial intelligence: AI is coming for our jobs. Over 10 percent of positions currently occupied by humans will be eliminated by cheaper, more efficient automated replacements, and experts agree AI could make redundant as many as 75 million jobs by 2025. With new reports sounding the alarm bells on what seems a daily basis, it's all too easy to get caught up in the negativity. But as we reflect back on a few of AI's achievements in 2018, I'd argue it's tough not to be encouraged by the good it can do -- specifically, the ways AI can augment skilled humans. Recall Unanimous AI, a startup headquartered in San Francisco and founded by Stanford-educated computer scientist and CEO Louis Rosenberg.